Smart Paper Notes

Learning Branched Fusion and Orthogonal Projection for Face-Voice Association

Muhammad Saad Saeed , Shah Nawaz , Muhammad Haris Khan , Sajid Javed , Muhammad Haroon Yousaf , Alessio Del Bue

Category: Computer Vision

2022-08-22

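Recent years have seen growing interest in associating the faces and voices of celebrities using audio-visual information from YouTube. Prior work adopts metric learning methods to learn an embedding space suited to the associated matching and verification tasks. Although these formulations show some progress, they are restrictive because they depend on a distance-dependent margin parameter, have poor run-time training complexity, and rely on carefully crafted negative mining procedures. In this work, we hypothesize that an enriched representation coupled with effective yet efficient supervision is important for realizing a discriminative joint embedding space for face-voice association tasks. To this end, we propose a lightweight, plug-and-play mechanism that exploits the complementary cues in both modalities to form enriched fused embeddings and clusters them by identity label via orthogonality constraints. We coin the proposed mechanism fusion and orthogonal projection (FOP) and instantiate it in a two-stream network. The resulting framework is evaluated on the VoxCeleb1 and MAV-Celeb datasets across a multitude of tasks, including cross-modal verification and matching. Results show that our method performs favourably against current state-of-the-art methods and that our proposed supervision formulation is more effective and efficient than those employed by contemporary methods. In addition, we use cross-modal verification and matching tasks to analyze the impact of multiple languages on face-voice association. Code is available at https://github.com/msaadsaeed/fop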

Fusion and Orthogonal Projection for Improved Face-Voice Association

Muhammad Saad Saeed , Muhammad Haris Khan , Shah Nawaz , Muhammad Haroon Yousaf , Alessio Del Bue

Category: Computer Vision

2021-12-20

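We study the problem of learning the association between faces and voices, which has recently attracted interest in the computer vision community. Existing work adopts pairwise or triplet loss formulations to learn an embedding space suited to the associated matching and verification tasks. Although these loss formulations show some progress, they are restrictive because they depend on a distance-dependent margin parameter, have poor run-time training complexity, and rely on carefully crafted negative mining procedures. In this work, we hypothesize that an enriched feature representation coupled with effective yet efficient supervision is necessary for realizing a discriminative joint embedding space for improved face-voice association. To this end, we propose a lightweight, plug-and-play mechanism that exploits the complementary cues in both modalities to form enriched fused embeddings and clusters them by identity label via orthogonality constraints. We coin the proposed mechanism fusion and orthogonal projection (FOP) and instantiate it in a two-stream pipeline. The resulting framework is evaluated on the large-scale VoxCeleb dataset across a multitude of tasks, including cross-modal verification and matching. Results show that our method performs favourably against current state-of-the-art methods and that our proposed supervision formulation is more effective and efficient than those employed by contemporary methods.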
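
As a rough illustration of what clustering embeddings "by identity label via orthogonality constraints" can mean, the sketch below penalises same-identity embeddings for not being aligned and different-identity embeddings for not being orthogonal. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `orthogonality_loss`, the use of mean cosine similarities, and the way the two terms are combined are all hypothetical choices made for illustration.

```python
import numpy as np


def orthogonality_loss(embeddings, labels, eps=1e-8):
    """Illustrative orthogonality-style loss (hypothetical, not the paper's code).

    Pushes same-identity pairs toward cosine similarity 1 and
    different-identity pairs toward cosine similarity 0.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # L2-normalise rows so that dot products are cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, eps)
    cos = unit @ unit.T

    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    pos = same & off_diag   # same identity, excluding self-pairs
    neg = ~same             # different identities

    # Fall back to the ideal value when a batch has no pairs of that kind.
    pos_mean = cos[pos].mean() if pos.any() else 1.0
    neg_mean = cos[neg].mean() if neg.any() else 0.0

    return float((1.0 - pos_mean) + abs(neg_mean))
```

Unlike a triplet loss, a formulation of this kind needs no margin parameter and no negative mining, because every pair in the batch contributes through the label mask.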

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

HistoSeg : Quick attention with multi-loss function for multi-structure segmentation in digital histology images

Saad Wazir , Muhammad Moazam Fraz

Category: Computer Vision

2022-09-01

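Medical image segmentation assists computer-aided diagnosis, surgery, and treatment. Digitized tissue slide images are used to analyze and segment glands, nuclei, and other biomarkers, which are in turn used in computer-aided medical applications. To this end, many researchers have developed neural networks for segmenting histological images; most are based on an encoder-decoder architecture and also use complex attention modules or transformers. However, these networks are less accurate at capturing relevant local and global features with accurate boundary detection at multiple scales. We therefore propose an encoder-decoder network, a Quick Attention module, and a multi-loss function (a combination of binary cross-entropy (BCE) loss, focal loss, and Dice loss). We evaluate the generalization capability of the proposed network on two publicly available medical image segmentation datasets, MoNuSeg and GlaS, and outperform state-of-the-art networks with a 1.99% improvement on the MoNuSeg dataset and a 7.15% improvement on the GlaS dataset. Implementation code is available at this link: https://bit.ly/histoseg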
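
To make the idea of a multi-loss function concrete, here is a minimal NumPy sketch that combines BCE, focal, and Dice losses for binary segmentation. It is an illustration under stated assumptions rather than the HistoSeg implementation: the equal weights, the focal `gamma=2.0`, the Dice smoothing term, and all function names are hypothetical defaults that the abstract does not specify.

```python
import numpy as np


def bce_loss(p, y, eps=1e-7):
    """Mean binary cross-entropy between probabilities p and targets y."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))


def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Focal loss: cross-entropy down-weighted for well-classified pixels."""
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)  # probability of the true class
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))


def dice_loss(p, y, smooth=1.0):
    """Soft Dice loss: 1 minus the (smoothed) overlap ratio."""
    intersection = np.sum(p * y)
    dice = (2.0 * intersection + smooth) / (np.sum(p) + np.sum(y) + smooth)
    return float(1.0 - dice)


def multi_loss(p, y, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of BCE, focal, and Dice losses (weights are assumed)."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    w_bce, w_focal, w_dice = weights
    return (w_bce * bce_loss(p, y)
            + w_focal * focal_loss(p, y)
            + w_dice * dice_loss(p, y))
```

The three terms are complementary: BCE scores each pixel independently, the focal term concentrates on hard pixels, and the Dice term measures region overlap, which is less sensitive to class imbalance.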

Blind-Spot Collision Detection System for Commercial Vehicles Using Multi Deep CNN Architecture

Muhammad Muzammel , Mohd Zuki Yusoff , Mohamad Naufal Mohamad Saad , Faryal Sheikh , Muhammad Ahsan Awais

Category: Computer Vision

2022-08-17

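Buses and heavy vehicles have more blind spots than cars and other road vehicles because of their larger size. Accidents caused by these heavy vehicles are therefore more often fatal and cause severe injuries to other road users. Such potential blind-spot collisions can be identified early using vision-based object detection methods. However, existing state-of-the-art vision-based object detection models rely heavily on a single feature descriptor for making decisions. In this study, the design of two convolutional neural networks (CNNs) based on high-level feature descriptors, and their integration with Faster R-CNN, is proposed to detect blind-spot collisions for heavy vehicles. In addition, a fusion approach is proposed to integrate two pre-trained networks (i.e., ResNet 50 and ResNet 101) for extracting high-level features for blind-spot vehicle detection. The fusion of features significantly improves the performance of Faster R-CNN and outperforms existing state-of-the-art methods. Both approaches are validated on a self-recorded blind-spot vehicle detection dataset for buses and on the online LISA dataset for vehicle detection. For the two proposed approaches, false detection rates (FDR) of 3.05% and 3.49% are obtained on the self-recorded dataset, making these approaches suitable for real-time applications.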

BIO-CXRNET: A Robust Multimodal Stacking Machine Learning Technique for Mortality Risk Prediction of COVID-19 Patients using Chest X-Ray Images and Clinical Data

Tawsifur Rahman , Muhammad E. H. Chowdhury , Amith Khandakar , Zaid Bin Mahbub , Md Sakib Abrar Hossain , Abraham Alhatou , Eynas Abdalla , Sreekumar Muthiyal , Khandaker Farzana Islam , Saad Bin Abul Kashem

Category: Computer Vision | Machine Learning

2022-06-15

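Fast and accurate detection of the disease can greatly help reduce the strain on any country's healthcare facilities and lower mortality during a pandemic. The goal of this work is to create a multimodal system, using a novel machine learning framework, that uses both chest X-ray (CXR) images and clinical data to predict severity in COVID-19 patients. In addition, the study presents a nomogram-based scoring technique for predicting the likelihood of death in high-risk patients. The study uses 25 biomarkers and CXR images to predict risk in 930 COVID-19 patients from the first wave of COVID-19 in Italy (March to June 2020). The proposed multimodal stacking technique achieved a precision, sensitivity, and F1-score of 89.03%, 90.44%, and 89.03%, respectively, in identifying low-risk or high-risk patients. This multimodal approach improved accuracy by 6% compared with CXR images or clinical data alone. Finally, a nomogram scoring system based on multivariate logistic regression was used to stratify the mortality risk among the high-risk patients identified in the first stage. Lactate dehydrogenase (LDH), O2 percentage, white blood cell (WBC) count, age, and C-reactive protein (CRP) were identified as useful predictors using a random forest feature selection model. A nomogram score based on the five predictor parameters and the CXR images was developed to quantify the probability of death and to categorize patients into two risk groups: survival (<50%) and death (>=50%). The multimodal technique was able to predict the death probability of high-risk patients with an F1-score of 92.88%. The areas under the curve for the development and validation cohorts are 0.981 and 0.939, respectively.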

A Survey on Training Challenges in Generative Adversarial Networks for Biomedical Image Analysis

Muhammad Muneeb Saad , Ruairi O'Reilly , Mubashir Husain Rehmani

Category: Machine Learning | Computer Vision

2022-01-19

In biomedical image analysis, the applicability of deep learning methods is directly impacted by the quantity of image data available. This is because deep learning models require large image datasets to provide high-level performance. Generative Adversarial Networks (GANs) have been widely utilized to address data limitations through the generation of synthetic biomedical images. A GAN consists of two models: the generator, which learns how to produce synthetic images based on the feedback it receives, and the discriminator, which classifies an image as synthetic or real and provides feedback to the generator. Throughout the training process, a GAN can experience several technical challenges that impede the generation of suitable synthetic imagery. The first is the mode collapse problem, whereby the generator either produces an identical image or produces a uniform image from distinct input features. The second is the non-convergence problem, whereby the gradient descent optimizer fails to reach a Nash equilibrium. The third is the vanishing gradient problem, whereby unstable training behavior occurs because the discriminator achieves optimal classification performance, resulting in no meaningful feedback being provided to the generator. These problems result in the production of synthetic imagery that is blurry, unrealistic, and less diverse. To date, there has been no survey article outlining the impact of these technical challenges in the context of the biomedical imagery domain. This work presents a review and taxonomy based on solutions to the training problems of GANs in the biomedical imaging domain. This survey highlights important challenges and outlines future research directions for the training of GANs in the domain of biomedical imagery.

Neural Networks for Infectious Diseases Detection: Prospects and Challenges

Muhammad Azeem , Shumaila Javaid , Hamza Fahim , Nasir Saeed

Category: Machine Learning

2021-12-07

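The ability of artificial neural networks (ANNs) to learn, correct errors, and transform large amounts of raw data into useful medical decisions for treatment and care has increased their popularity for enhancing patient safety and quality of care. This paper therefore reviews the critical role of ANNs in providing valuable insights for patient healthcare decisions and efficient disease diagnosis. We thoroughly review the different types of ANNs in the existing literature that have advanced the adaptation of ANNs to complex applications. We also investigate the advances of ANNs in the diagnosis and treatment of various diseases, such as viral diseases, skin diseases, cancer, and COVID-19. Furthermore, we propose a novel deep convolutional neural network (CNN) model called ConxNet for improving the detection accuracy of COVID-19. ConxNet is trained and tested using different datasets, and it achieves more than 97% detection accuracy and precision, which is significantly better than existing models. Finally, we highlight future research directions and challenges, such as the complexity of the algorithms, data availability, privacy and security, and the integration of biosensing with ANNs. These research directions require considerable attention to broaden the scope of ANNs in medical diagnosis and treatment applications.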

Designing the Architecture of a Convolutional Neural Network Automatically for Diabetic Retinopathy Diagnosis

Fahman Saeed , Muhammad Hussain , Hatim A Aboalsamh , Fadwa Al Adel , Adi Mohammed Al Owaifeer

Category: Artificial Intelligence | Computer Vision

2021-10-08

The prevalence of diabetic retinopathy (DR) has reached 34.6% worldwide and is a major cause of blindness among middle-aged diabetic patients. Regular DR screening using fundus photography helps detect its complications and prevent its progression to advanced levels. As manual screening is time-consuming and subjective, machine learning (ML) and deep learning (DL) have been employed to aid graders. However, the existing CNN-based methods use either pre-trained CNN models or a brute force approach to design new CNN models, which are not customized to the complexity of fundus images. To overcome this issue, we introduce an approach for the custom design of CNN models, whose architectures are adapted to the structural patterns of fundus images and better represent the DR-relevant features. It leverages k-medoid clustering, principal component analysis (PCA), and inter-class and intra-class variations to automatically determine the depth and width of a CNN model. The designed models are lightweight, adapted to the internal structures of fundus images, and encode the discriminative patterns of DR lesions. The technique is validated on a local dataset from King Saud University Medical City, Saudi Arabia, and two challenging benchmark datasets from Kaggle: EyePACS and APTOS2019. The custom-designed models outperform well-known pre-trained CNN models such as ResNet152, DenseNet121, and ResNeSt50 with a significant decrease in the number of parameters, and they compete well with the state-of-the-art CNN-based DR screening methods. The proposed approach is helpful for DR screening under diverse clinical settings and for referring patients who may need further assessment and treatment to expert ophthalmologists.

Conservation Tools: The Next Generation of Engineering--Biology Collaborations

Andrew Schulz , Cassie Shriver , Suzanne Stathatos , Benjamin Seleb , Emily Weigel , Young-Hui Chang , M. Saad Bhamla , David Hu , Joseph R. Mendelson III , .

Category: Machine Learning

2023-01-03

The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
